Extracting Molecular Binding Relationships from Biomedical Text
Abstract
ARBITER is a Prolog program that extracts assertions about macromolecular binding relationships from biomedical text. We describe the domain knowledge and the underspecified linguistic analyses that support the identification of these predications. After discussing a formal evaluation of ARBITER, we report on its application to 491,000 MEDLINE abstracts, during which almost 25,000 binding relationships suitable for entry into a database of macromolecular function were extracted.
Similar Resources
Learning the Structure of Biomedical Relationships from Unstructured Text
The published biomedical research literature encompasses most of our understanding of how drugs interact with gene products to produce physiological responses (phenotypes). Unfortunately, this information is distributed throughout the unstructured text of over 23 million articles. The creation of structured resources that catalog the relationships between drugs and genes would accelerate the tr...
Mining molecular binding terminology from biomedical text
Automatic access to information regarding macromolecular binding relationships would provide a valuable resource to the biomedical community. We report on a pilot project to mine such information from the molecular biology literature. The program being developed takes advantage of natural language processing techniques and is supported by two repositories of biomolecular knowledge. A formative ...
A Framework for Schema-Driven Relationship Discovery from Unstructured Text
We address the issue of extracting implicit and explicit relationships between entities in biomedical text. We argue that entities seldom occur in text in their simple form and that relationships in text relate the modified, complex forms of entities with each other. We present a rule-based method for (1) extraction of such complex entities and (2) relationships between them and (3) the convers...
Subsequence Kernels for Relation Extraction
We present a new kernel method for extracting semantic relations between entities in natural language text, based on a generalization of subsequence kernels. This kernel uses three types of subsequence patterns that are typically employed in natural language to assert relationships between two entities. Experiments on extracting protein interactions from biomedical corpora and top-level relatio...
Parts-of-Speech Tagger Errors Do Not Necessarily Degrade Accuracy in Extracting Information from Biomedical Text
Background: An ongoing assessment of the literature is difficult given the rapidly increasing volume of research publications and the limited number of effective information extraction tools that identify entity relationships from text. A recent study reported development of Muscorian, a generic text processing tool for extracting protein-protein interactions from text that achieved comparable performance to ...